Automating Transformations in Data Vault Data Warehouse Loads

Authors

  • Mikko Puonti
  • Timo Raitalaakso
  • Timo Aho
  • Tommi Mikkonen
Abstract

Data warehousing is a process of integrating multiple data sources into one for, e.g., reporting purposes. An emerging modeling technique for this is the data vault method. Using data vault produces many structurally similar data processing operations in the transform phase of ETL work. Is it possible to automate the creation of these transformations? Based on our study, the answer is mostly affirmative. Data vault modeling imposes certain constraints on data warehouse entities. These model constraints and the principles for populating data vault tables can be used to generate transformation code. We can gather the populating principles from the original relational database model and data flow metadata, and then use them to create general templates for each entity. Note, however, that the use of data flow metadata can be only partially automated and contains the only manual work phases in the process. In the end, we can generate the actual transformation code automatically. In this paper, we carefully describe the creation of the automation procedure and analyze the practical problems based on our experience with a PL/SQL proof-of-concept implementation. To the best of our knowledge, a similar approach has not yet been described in the scientific literature.
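
The idea of filling a general per-entity template from mapping metadata can be illustrated with a minimal sketch. It is not the authors' PL/SQL implementation: the template text, the `HubMapping` fields, and the table and column names (`hub_customer`, `stg_customer`, `load_date`, `record_source`) are hypothetical and chosen only for illustration. The sketch covers one entity type, a hub, and emits an insert statement that adds business keys not yet present in the hub.

```python
from dataclasses import dataclass
from string import Template

# Hypothetical template for loading a data vault hub. The SQL shape is an
# illustrative assumption, not the template used in the paper.
HUB_TEMPLATE = Template(
    "INSERT INTO $hub ($business_key, load_date, record_source)\n"
    "SELECT DISTINCT s.$source_key, CURRENT_TIMESTAMP, '$source'\n"
    "FROM $source s\n"
    "WHERE NOT EXISTS (\n"
    "  SELECT 1 FROM $hub h WHERE h.$business_key = s.$source_key\n"
    ")"
)


@dataclass(frozen=True)
class HubMapping:
    """Mapping metadata for one hub (all field names are hypothetical)."""
    hub: str           # target hub table
    business_key: str  # business key column in the hub
    source: str        # source (staging) table
    source_key: str    # column in the source that holds the business key


def generate_hub_load(mapping: HubMapping) -> str:
    """Fill the hub template with the metadata of a single mapping."""
    return HUB_TEMPLATE.substitute(
        hub=mapping.hub,
        business_key=mapping.business_key,
        source=mapping.source,
        source_key=mapping.source_key,
    )


if __name__ == "__main__":
    mapping = HubMapping(
        hub="hub_customer",
        business_key="customer_no",
        source="stg_customer",
        source_key="cust_no",
    )
    print(generate_hub_load(mapping))
```

In this sketch the metadata record stands in for the manually curated data flow metadata, while the template substitution stands in for the fully automatic generation step.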

Similar resources

A direct approach to physical Data Vault design

The paper presents a novel agile approach to large scale design of enterprise data warehouses based on a Data Vault model. An original, simple and direct algorithm is defined for the incremental design of physical Data Vault type enterprise data warehouses, using source data meta-model and rules, and used in developing a prototype case tool for Data Vault design. This approach solves primary re...

EXTENSIBLE MARKUP LANGUAGE (XML) SCHEMAS FOR DATA VAULT MODELS

With the conceptualization of next generation architecture for Data Warehousing, that is the DW 2.0, there is now an increased emphasis on using fully-temporalized databases, in particular with approaches such as the Data Vault. In this paper we present a template XML Schema for Data Vault model concepts (at a metamodel level) and a process for creating XML Schemas for data warehouse designs re...

A Meta Data Vault Approach for Evolutionary Integration of Big Data Sets: Case Study Using the NCBI Database for Genetic Variation

A data warehouse integrates data from various and heterogeneous data sources and creates a consolidated view of the data that is optimized for reporting and analysis. Today, business and technology are constantly evolving, which directly affects the data sources. New data sources can emerge while some can become unavailable. The DW or the data mart that is based on these data sources needs to r...

Data Warehouse and Master Data Management Evolution – a Meta-data-vault Approach

The paper presents: a) a brief overview and analysis of existing approaches to the data warehouse (DW) evolution problem, and b) a detailed description of the research idea for the DW evolution problem (primarily intended for structured data sources and realizations with relational database engines). We observe the DW evolution problem as a double issue from the DW perspective, and from the maste...

Resumption of Interrupted Warehouse Loads

Data warehouses collect large quantities of data from distributed sources into a single repository. A typical load to create or maintain a warehouse processes GBs of data, takes hours or even days to execute, and involves many complex and user-defined transformations of the data (e.g., find duplicates, resolve data inconsistencies, and add unique keys). If the load fails, a possible approach is to...

Publication date: 2016